Exact algorithms for planted motif challenge problems
Abstract
The problem of identifying meaningful patterns (i.e., motifs) from biological data has been studied extensively due to its paramount importance. Three versions of this problem have been identified in the literature. One of these three problems is the planted (l, d)-motif problem. Several instances of this problem have been posed as a challenge. Numerous algorithms have been proposed in the literature that address this challenge. Many of these algorithms fall under the category of approximation algorithms. In this paper we present algorithms for the planted (l, d)-motif problem that always find the correct answer(s). Our algorithms are very simple and are based on some ideas that are fundamentally different from the ones employed in the literature. We believe that the techniques we introduce in this paper will find independent applications. This research has been supported in part by the NSF Grants CCR-9912395 and ITR-0326155.
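The abstract does not restate the problem, so as a point of reference: in its usual formulation, the planted (l, d)-motif problem takes a set of sequences and asks for every string M of length l such that each sequence contains at least one length-l substring within Hamming distance d of M. The sketch below is a brute-force enumeration over all candidate l-mers. It is not the algorithm presented in the paper; it only illustrates what "always finding the correct answer(s)" means for an exact method. The function names and the `alphabet` parameter are hypothetical, chosen for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, seq, d):
    """True if some substring of seq with len(motif) is within distance d of motif."""
    l = len(motif)
    return any(
        hamming(motif, seq[i:i + l]) <= d
        for i in range(len(seq) - l + 1)
    )


def planted_motifs(seqs, l, d, alphabet="ACGT"):
    """Exhaustively test every l-mer over the alphabet (illustrative only).

    Runs in time proportional to |alphabet|**l times the total input length,
    so it is practical only for very small l.
    """
    found = []
    for letters in product(alphabet, repeat=l):
        candidate = "".join(letters)
        if all(occurs_within(candidate, s, d) for s in seqs):
            found.append(candidate)
    return found


if __name__ == "__main__":
    seqs = ["AACGT", "TACGG", "CACGA"]
    print(planted_motifs(seqs, l=3, d=0))
```

Because every candidate is checked, no valid motif can be missed, which is the defining property of an exact algorithm. The cost grows exponentially in l, which is why practical exact methods need a smarter search strategy than this enumeration.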
Related resources
Exact Algorithms for Planted Motif Problems
The problem of identifying meaningful patterns (i.e., motifs) from biological data has been studied extensively due to its paramount importance. Three versions of this problem have been identified in the literature. One of these three problems is the planted (l, d)-motif problem. Several instances of this problem have been posed as a challenge. Numerous algorithms have been proposed in the lite...
Space and Time Efficient Algorithms for Planted Motif Search
We consider the (l, d) Planted Motif Search Problem, a problem that arises from the need to find transcription factor-binding sites in genomic information. We propose the algorithms PMSi and PMSP which are based on ideas considered in PMS1 [6]. These algorithms are exact, make use of less space than the known exact algorithms such as PMS and are able to tackle instances with large values of d. ...
An experimental comparison of two different paradigms in Evolutionary Computation
The DNA motif finding problem is of great relevance in molecular biology. Motifs play an important role in all biological processes since they control the production of certain proteins by turning on and off the genes that encode them. Each motif is a short string of unknown length that can be located anywhere in the genome. This makes the problem much more difficult, so ...
PairMotif: A New Pattern-Driven Algorithm for Planted (l, d) DNA Motif Search
Motif search is a fundamental problem in bioinformatics with an important application in locating transcription factor binding sites (TFBSs) in DNA sequences. The exact algorithms can report all (l, d) motifs and find the best one under a specific objective function. However, it is still a challenging task to identify weak motifs, since either a large amount of memory or execution time is requi...
Hybrid Gibbs-sampling algorithm for challenging motif discovery: GibbsDST.
The difficulties of computational discovery of transcription factor binding sites (TFBS) are well represented by (l, d) planted motif challenge problems. Large d problems are difficult, particularly for profile-based motif discovery algorithms. Their local search in the profile space is apparently incompatible with subtle motifs and large mutational distances between the motif occurrences. Here...